DID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection

Abstract

Monocular 3D detection has drawn much attention from the community due to its low cost and setup simplicity. It takes an RGB image as input and predicts 3D boxes in 3D space. The most challenging sub-task is instance depth estimation. Previous works usually use a direct estimation method. However, in this paper we point out that the instance depth on the RGB image is non-intuitive. It is coupled by visual depth clues and instance attribute clues, making it hard to be directly learned in the network. Therefore, we propose to reformulate the instance depth as the combination of the instance visual surface depth (visual depth) and the instance attribute depth (attribute depth). The visual depth is related to objects’ appearances and positions on the image. By contrast, the attribute depth relies on objects’ inherent attributes, which are invariant to the object affine transformation on the image. Correspondingly, we decouple the 3D location uncertainty into visual depth uncertainty and attribute depth uncertainty. By combining different types of depths and associated uncertainties, we can obtain the final instance depth. Furthermore, data augmentation in monocular 3D detection is usually limited due to the physical nature, hindering the boost of performance. Based on the proposed instance depth disentanglement strategy, we can alleviate this problem. Evaluated on KITTI, our method achieves new state-of-the-art results, and extensive ablation studies validate the effectiveness of each component in our method. The codes are released at https://github.com/SPengLiang/DID-M3D .
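The abstract says that different types of depths and their associated uncertainties are combined into a final instance depth, but it does not give the formula. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each image patch yields a visual depth and an attribute depth that add up to a per-patch instance depth, that the two uncertainties combine in quadrature, and that patches are averaged with weights exp(-uncertainty). The function name combine_instance_depth and all parameter names are hypothetical.

```python
import math


def combine_instance_depth(visual_depths, attribute_depths,
                           visual_uncertainties, attribute_uncertainties):
    """Fuse per-patch depth estimates into a single instance depth.

    Assumptions (not stated in the abstract): depths are additive,
    uncertainties combine in quadrature, and each patch is weighted
    by exp(-uncertainty) when averaging.
    """
    lengths = {len(visual_depths), len(attribute_depths),
               len(visual_uncertainties), len(attribute_uncertainties)}
    if len(lengths) != 1 or not visual_depths:
        raise ValueError("all inputs must be non-empty and of equal length")

    # Per-patch instance depth: visual depth plus attribute depth.
    instance_depths = [dv + da
                       for dv, da in zip(visual_depths, attribute_depths)]

    # Per-patch instance uncertainty from the two decoupled uncertainties.
    instance_uncertainties = [math.sqrt(uv ** 2 + ua ** 2)
                              for uv, ua in zip(visual_uncertainties,
                                                attribute_uncertainties)]

    # Lower uncertainty gives a larger weight in the final average.
    weights = [math.exp(-u) for u in instance_uncertainties]
    total_weight = sum(weights)
    final_depth = sum(w * d
                      for w, d in zip(weights, instance_depths)) / total_weight

    return final_depth, instance_depths, instance_uncertainties


if __name__ == "__main__":
    depth, per_patch, per_patch_unc = combine_instance_depth(
        visual_depths=[20.0, 22.0],
        attribute_depths=[1.0, 1.0],
        visual_uncertainties=[0.1, 2.0],
        attribute_uncertainties=[0.1, 2.0],
    )
    print(depth, per_patch, per_patch_unc)
```

In this sketch, patches with equal uncertainty contribute equally, so the result reduces to the plain mean of the per-patch instance depths. A patch with much lower uncertainty pulls the final depth toward its own estimate.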

Related resources

Monocular Object Detection Using 3D Geometric Primitives

Multiview object detection methods achieve robustness in adverse imaging conditions by exploiting projective consistency across views. In this paper, we present an algorithm that achieves performance comparable to multiview methods from a single camera by employing geometric primitives as proxies for the true 3D shape of objects, such as pedestrians or vehicles. Our key insight is that for a ca...

TemplateNet for Depth-Based Object Instance Recognition

We present a novel deep architecture termed templateNet for depth-based object instance recognition. Using an intermediate template layer, we exploit prior knowledge of an object’s shape to sparsify the feature maps. This has three advantages: (i) the network is better regularised, resulting in structured filters; (ii) the sparse feature maps result in intuitive features being learnt, which can be...

Sliding Shapes for 3D Object Detection in Depth Images

The depth information of RGB-D sensors has greatly simplified some common challenges in computer vision and enabled breakthroughs for several tasks. In this paper, we propose to use depth maps for object detection and design a 3D detector to overcome the major difficulties for recognition, namely the variations of texture, illumination, shape, viewpoint, clutter, occlusion, self-occlusion and se...

Depth-assisted Real-time 3D Object Detection for Augmented Reality

In this paper, we propose a novel method of real-time object detection that can recognize three-dimensional (3D) target objects, regardless of their texture and lighting condition changes. Our method computes a set of reference templates of a target object from both RGB and depth images, which describes the texture and geometry of the object, and fuses them for robust detection. Combining both ...

Multiple Instance Boosting for Object Detection

A good image object detection algorithm is accurate, fast, and does not require exact locations of objects in a training set. We can create such an object detector by taking the architecture of the Viola-Jones detector cascade and training it with a new variant of boosting that we call MILBoost. MILBoost uses cost functions from the Multiple Instance Learning literature combined with the AnyBoo...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2022

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-031-19769-7_5